Atom AI Labs - AI-Powered Multi-Tenant Platform

50 E2E Test Scenarios - Bug Analysis and Fixes

**Date:** 2026-02-09

**Test Script:** scripts/test_50_scenarios.py

**Results:** 72% pass rate (36/50 tests)

---

Test Results Summary

Category	Pass Rate	Status
Auth	5/5 (100.0%)	✅ Perfect
Agent	5/7 (71.4%)	⚠️ Issues
Execution	6/6 (100.0%)	✅ Perfect
Graduation	5/7 (71.4%)	⚠️ Issues
Episodes	4/5 (80.0%)	⚠️ Minor issues
Admin	2/5 (40.0%)	❌ Major issues
Errors	5/5 (100.0%)	✅ Perfect
Edge Cases	0/5 (0.0%)	❌ All failed
Performance	3/3 (100.0%)	✅ Perfect
Integration	1/2 (50.0%)	⚠️ Issues

---

Root Cause Analysis

1. HTTP 429 Errors - Quota Enforcement (Not Rate Limiting)

**Tests Affected:**

[Agent] Enforce quota limits - "Quota enforced at agent 7"
[Agent] Create agent with all capabilities - HTTP 429
[Edge Cases] Handle unicode/special chars - HTTP 429
[Edge Cases] Handle long agent names - HTTP 429
[Edge Cases] Handle invalid maturity level - HTTP 429

**Root Cause:**

The HTTP 429 errors are from QuotaManager.check_agent_quota() which raises HTTP 429 when the agent limit is reached, NOT from rate limiting.

# backend-saas/core/quota_manager.py:99
raise HTTPException(
    status_code=429,
    detail=f"Agent limit reached ({agent_count}/{limit}). Please upgrade your plan."
)

**Issue:**

Solo plan allows 10 agents
Test creates 6 agents in basic agent tests
Previous test runs may have left agents in database
No cleanup between runs causes accumulation

**Quota Limits:**

free: 3 agents
solo: 10 agents (QuotaManager.QUOTAS["solo"]["max_agents"])
team: 25 agents
enterprise: 1000 agents

**Fix Required:**

Add test cleanup to delete agents after each test run
Or use unique tenant subdomain for each test run
Or increase solo plan quota for testing

---

2. Admin Creation Response Missing Role

**Test Affected:**

[Admin] Create workspace admin - Role: N/A

**Root Cause:**

The create-admin endpoint returns TestAuthResponse (without role field) instead of AdminAuthResponse.

# backend-saas/api/routes/test_auth_routes.py:271
return TestAuthResponse(  # ❌ Missing role field
    user_id=str(user.id),
    tenant_id=str(tenant.id),
    test_token=test_token,
    email=user.email,
    name=user.name  # ❌ Should be user.first_name or user.email
)

**Fix Applied:**

Changed return to AdminAuthResponse with role field:

return AdminAuthResponse(
    user_id=str(user.id),
    tenant_id=str(tenant.id),
    test_token=test_token,
    token_type="test",
    email=user.email,
    name=user.first_name or user.email,
    role=user.role  # ✅ Now includes role
)

**Status:** ✅ FIXED

---

3. Promotion/Demotion HTTP 500 Errors

**Tests Affected:**

[Graduation] Promote agent with auth - HTTP 500
[Graduation] Demote agent with auth - HTTP 500
[Admin] Promote with JWT auth - HTTP 500
[Admin] Demote with JWT auth - HTTP 500

**Root Cause:**

The test tries to promote an agent from one tenant using an admin user from a different tenant.

# Test setup creates:
self.tenant_id = "team-plan-tenant"  # From setup_tenant()
self.admin_tenant_id = "admin-tenant"  # From setup_admin()
self.agent_id = "agent-in-team-tenant"

# Promotion test tries to:
POST /api/graduation/agents/{agent_id}/promote
Headers: {
    "Authorization": f"Bearer {self.admin_token}",  # Admin from admin-tenant
    "X-Tenant-ID": self.admin_tenant_id,  # Different tenant!
    "X-User-ID": self.admin_user_id
}

**Backend Logic:**

# backend-saas/api/routes/graduation_routes.py:369
tenant_id = await extract_tenant_id(request)  # Gets admin_tenant_id
user_id = await extract_user_id(request)

# Tries to find agent in admin_tenant_id, but agent is in team-tenant
# Returns 500 error when agent not found

**Fix Required:**

Create admin user in the same tenant as the agent
Or create a test agent in the admin tenant for promotion tests
Update test to use same tenant for both agent and admin

---

4. Episode Feedback Test Flow Issue

**Test Affected:**

[Episodes] Submit episode feedback - "Failed to create episode"

**Root Cause:**

The test tries to manually create an episode before submitting feedback, but the episode creation endpoint requires a valid execution_id.

# Test flow:
1. Create episode with POST /api/test/episodes/create  # ❌ This endpoint doesn't exist
2. Submit feedback to created episode  # Never reaches here

**Correct Flow:**

Episodes are automatically created during agent execution. The test should:

Execute an agent skill (creates episode)
Get the episode_id from execution response
Submit feedback for that episode

**Fix Required:**

Update test to use real agent execution flow:

# 1. Execute agent (creates episode)
exec_response = requests.post(
    f"{BASE_URL}/api/test/agents/{agent_id}/execute",
    json={"skill_name": "read", "params": {"query": "test"}},
    headers={...}
)
execution_id = exec_response.json()["execution_id"]

# 2. Get episode from execution
episode_response = requests.get(
    f"{BASE_URL}/api/graduation/agents/{agent_id}/episodes?limit=1",
    headers={...}
)
episode_id = episode_response.json()["episodes"][0]["id"]

# 3. Submit feedback
feedback_response = requests.post(
    f"{BASE_URL}/api/graduation/episodes/{episode_id}/feedback",
    json={"feedback_score": 0.8, "feedback_notes": "Great work!"},
    headers={...}
)

---

5. Edge Case Validation Errors

**Tests Affected:**

[Edge Cases] Handle zero episode count - HTTP 422
[Edge Cases] Handle concurrent creation - 0/3 successful

**Root Cause for Zero Episode Count:**

The readiness endpoint validates episode_count parameter:

# backend-saas/api/routes/graduation_routes.py:45
class ExamRequest(BaseModel):
    episode_count: int = Field(default=30, ge=10, le=100)  # ❌ Requires >= 10

**Fix Required:**

Test should use episode_count=10 (minimum valid value) instead of 0.

**Root Cause for Concurrent Creation:**

Concurrent agent creation requests may hit quota enforcement simultaneously before quota is updated.

**Fix Required:**

Add delays between concurrent requests
Or use sequential creation for reliability
Or handle 429 responses and retry after delay

---

Rate Limiting vs Quota Enforcement

Common Misconception

**HTTP 429** errors in this test are NOT from rate limiting:

Feature	Rate Limiting	Quota Enforcement
Source	`AbuseProtectionService.checkRateLimit()`	`QuotaManager.check_agent_quota()`
Storage	Redis (sliding window)	PostgreSQL (persistent count)
Bypass	`X-Test-Secret` header	No bypass (hard limit)
Error Code	429 (from middleware)	429 (from quota check)
Limit	Requests per minute (60-6000)	Total agents (3-1000)

Verification

The test endpoints are exempt from rate limiting:

# backend-saas/core/security/__init__.py:30
if any(path.startswith(prefix) for prefix in self.exempted_prefixes) or test_secret:
    return await call_next(request)  # Bypass rate limiting

self.exempted_prefixes = [
    "/api/test",  # ✅ Test endpoints exempt
    ...
]

But quota enforcement still applies:

# backend-saas/api/routes/test_auth_routes.py:381
QuotaManager.check_agent_quota(tenant_id, db)  # ❌ No bypass

---

Recommended Fixes

Priority 1: Test Agent Accumulation

Add cleanup function to delete all test agents after test run
Use unique tenant subdomain per run (e.g., test-{timestamp})
Add agent count logging to debug quota issues

Priority 2: Promotion Test Cross-Tenant Issue

Update test to create admin user in same tenant as agent
Or create test agent in admin tenant for promotion tests
Add tenant_id validation in promotion tests

Priority 3: Episode Feedback Test

Update test to use real agent execution flow
Get episode_id from execution response
Submit feedback for real episode

Priority 4: Edge Case Tests

Fix zero episode count to use minimum valid value (10)
Add delays between concurrent requests
Handle 429 responses with retry logic

---

Files Modified

Backend

backend-saas/api/routes/test_auth_routes.py - Fixed admin creation response to include role

Test Script (Pending)

scripts/test_50_scenarios.py - Needs updates for:
Admin/agent tenant alignment
Episode feedback flow
Edge case validation
Concurrent request handling

---

Next Steps

✅ **Fixed admin creation response** - Deploy with next deployment
⏳ **Update test script** - Fix promotion/feedback/edge case tests
⏳ **Add test cleanup** - Delete agents after each run
⏳ **Re-run tests** - Verify all fixes work correctly
⏳ **Document test patterns** - Create test development guidelines

---

Test Execution Command

python3 scripts/test_50_scenarios.py

**Expected Results After Fixes:**

Pass rate: 95%+ (up from 72%)
All admin tests: Working
All edge case tests: Working
Episode feedback: Working